FIG. 7 gives a more detailed description of a preferred
implementation of step 60 of FIG. 6, subject to the constraint 
of symmetry, just given. Where 2q communities are to be 
identified, the method of the invention may conveniently be 
implemented to follow the symmetry. 

The definition of the 2q communities proceeds as a series 
of q iterations of a sequence of steps, where each iteration 
produces two communities, one for pages with large coor- 
dinates and one for pages with small coordinates. Here, 
"smallest" is taken to mean most negative, i.e., having the 
largest negative magnitude. 

The iteration is expressed, in step 62, as a FOR loop to be 
executed q number of times. 

In step 64, a community, indexed as community 2i-1 (for
1≤i≤q), is defined, by choosing k pages with largest
coordinates in the vector H[i] as hubs (step 66), and choosing k
pages with largest coordinates in the vector A[i] as
authorities (step 66).

Next, in step 70, a community, indexed as community 2i
(for 1≤i≤q), is defined, by choosing k pages with smallest
coordinates in the vector H[i] as hubs (step 66), and
choosing k pages with smallest coordinates in the vector A[i] as
authorities (step 66).

This completes an iteration. In successive iterations, the 
pages with progressively smaller coordinates are used to 
define the hubs and authorities for odd-indexed 
communities, and the pages with progressively larger coor- 
dinates are used to define the hubs and authorities for 
even-indexed communities, until all 2q of the communities 
have been defined. 
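The loop of steps 62 through 70 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: it assumes the non-principal vectors H[i] and A[i] have already been computed and are supplied as plain lists of coordinates, it reads the community indices as 2i-1 and 2i, and the names `top_k` and `define_communities` are hypothetical.

```python
from typing import Dict, List, Tuple

Community = Tuple[List[int], List[int]]  # (hub page indices, authority page indices)


def top_k(vector: List[float], k: int, largest: bool) -> List[int]:
    """Return the indices of the k largest (or k most negative) coordinates."""
    order = sorted(range(len(vector)), key=lambda p: vector[p], reverse=largest)
    return order[:k]


def define_communities(H: List[List[float]], A: List[List[float]],
                       k: int) -> Dict[int, Community]:
    """Build 2q communities from q hub vectors and q authority vectors."""
    communities: Dict[int, Community] = {}
    for i in range(1, len(H) + 1):          # FOR loop executed q times
        h, a = H[i - 1], A[i - 1]
        # Community 2i-1: pages with the largest coordinates.
        communities[2 * i - 1] = (top_k(h, k, True), top_k(a, k, True))
        # Community 2i: pages with the smallest (most negative) coordinates.
        communities[2 * i] = (top_k(h, k, False), top_k(a, k, False))
    return communities


# Made-up vectors for four pages and q = 2, purely for illustration.
H = [[0.7, 0.1, -0.6, 0.2], [-0.5, 0.4, 0.3, -0.1]]
A = [[0.2, 0.8, -0.1, -0.5], [0.6, -0.3, -0.7, 0.1]]
communities = define_communities(H, A, k=2)
```

With q vectors of each kind the sketch yields communities numbered 1 through 2q, each holding k hub pages and k authority pages.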

MATHEMATICAL INTERPRETATION 

The discussion which follows will present a somewhat 
more theoretical treatment of the concepts relating to hubs 
and authorities, which have been discussed above. Certain 
aspects of the discussion may be foreseen from the existing 
literature. 

The hub and authority vectors H and A correspond to the 
principal eigenvectors of two matrices associated with the 
set of pages. 

Let M denote the matrix whose (i, j) entry gives the
number of pages that point to both page i and page j. Let N
denote the matrix whose (i, j) entry gives the number of
pages that are pointed to by both page i and page j.
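As a small illustration of these two definitions, the following Python sketch counts the entries of M and N directly from a list of links. The page numbering, the `links` list, and the function name `cocitation_and_coupling` are assumptions made for the example only.

```python
from typing import List, Tuple

Matrix = List[List[int]]


def cocitation_and_coupling(n: int, links: List[Tuple[int, int]]) -> Tuple[Matrix, Matrix]:
    """Return (M, N) for n pages, where each link is a (source, target) pair."""
    # adj[p][i] == 1 when page p points to page i.
    adj = [[0] * n for _ in range(n)]
    for src, dst in links:
        adj[src][dst] = 1
    # M[i][j]: number of pages p that point to both page i and page j.
    M = [[sum(adj[p][i] * adj[p][j] for p in range(n)) for j in range(n)]
         for i in range(n)]
    # N[i][j]: number of pages p that are pointed to by both page i and page j.
    N = [[sum(adj[i][p] * adj[j][p] for p in range(n)) for j in range(n)]
         for i in range(n)]
    return M, N


# Pages 0 and 1 act as hubs; pages 2, 3 and 4 are the pages they point to.
links = [(0, 2), (0, 3), (1, 2), (1, 3), (1, 4)]
M, N = cocitation_and_coupling(5, links)
```

Here M[2][3] is 2 because pages 0 and 1 both point to pages 2 and 3, and N[0][1] is 2 because pages 0 and 1 share the two targets 2 and 3. Both matrices are symmetric by construction.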

Then the above iterative procedures are in fact an imple- 
mentation of the power iteration method, given in G. Golub, 
C. F. Van Loan, "Matrix Computations", Johns Hopkins 
University Press, 1989, p. 351, for computing the principal 
eigenvectors of the matrices M and N. 

In particular, the authority vector A is the principal 
eigenvector of M, and the hub vector H is the principal 
eigenvector of N. 
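The relationship can be checked numerically on a toy graph. The sketch below is illustrative only: it assumes the update in which a page's authority value is the sum of the hub values of the pages pointing to it, and its hub value is the sum of the authority values of the pages it points to, followed by normalization to unit length. The graph and all function names are hypothetical.

```python
import math
from typing import List, Tuple


def normalize(v: List[float]) -> List[float]:
    """Scale v to unit Euclidean length (unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else v


def hub_authority_iteration(n: int, links: List[Tuple[int, int]],
                            iterations: int = 50) -> Tuple[List[float], List[float]]:
    """Alternate authority and hub updates, normalizing after each one."""
    hub = [1.0] * n
    auth = [1.0] * n
    for _ in range(iterations):
        new_auth = [0.0] * n
        for src, dst in links:
            new_auth[dst] += hub[src]    # authority gathers hub weight
        auth = normalize(new_auth)
        new_hub = [0.0] * n
        for src, dst in links:
            new_hub[src] += auth[dst]    # hub gathers authority weight
        hub = normalize(new_hub)
    return hub, auth


def cocitation_matrix(n: int, links: List[Tuple[int, int]]) -> List[List[float]]:
    """M[i][j] = number of pages that point to both page i and page j."""
    adj = [[0.0] * n for _ in range(n)]
    for src, dst in links:
        adj[src][dst] = 1.0
    return [[sum(adj[p][i] * adj[p][j] for p in range(n)) for j in range(n)]
            for i in range(n)]


def matvec(M: List[List[float]], v: List[float]) -> List[float]:
    return [sum(row[j] * v[j] for j in range(len(v))) for row in M]


links = [(0, 2), (0, 3), (1, 2), (1, 3), (1, 4)]
hub, auth = hub_authority_iteration(5, links)
M = cocitation_matrix(5, links)
Ma = matvec(M, auth)
# auth has unit length, so the Rayleigh quotient is a plain dot product.
eigenvalue = sum(x * y for x, y in zip(Ma, auth))
# If auth is an eigenvector of M, then M*auth - eigenvalue*auth vanishes.
residual = max(abs(x - eigenvalue * y) for x, y in zip(Ma, auth))
```

One authority update followed by one hub update multiplies the authority vector by M up to scaling, which is why the repeated updates behave as power iteration on M, and symmetrically on N for the hub vector.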

The additional vectors A[i] and H[i] correspond to
non-principal eigenvectors of M and N respectively. The use of
such eigenvectors for clustering is known as spectral
partitioning, and has been studied as a graph algorithm. See,
for instance, D. Spielman, S. Teng, "Spectral partitioning
works: Planar graphs and finite-element meshes," Proceedings
of the 37th IEEE Symposium on Foundations of
Computer Science, 1996.

The entries of the matrices M and N correspond to 
co-citation and bibliographic coupling, which have been 
studied in the bibliometric literature. 

Since the algorithm works on an arbitrary set of linked 
pages, it is worth noting that it can be run in a query- 
independent fashion. In particular, given a set of pages, the 
identification of hubs and authorities among them gives a 
method for automatically determining the topic that best 
"fits" the set of pages. 
SUMMARY

Using the foregoing specification, the invention may be 
implemented using standard programming and/or engineer- 
ing techniques using computer programming software, 
firmware, hardware or any combination or subcombination 
thereof. Any such resulting program(s), having computer 
readable program code means, may be embodied or pro- 
vided within one or more computer readable or usable media 
such as fixed (hard) drives, disk, diskettes, optical disks, 
magnetic tape, semiconductor memories such as read-only 
memory (ROM), etc., or any transmitting/receiving medium 
such as the Internet or other communication network or link, 
thereby making a computer program product, i.e., an article 
of manufacture, according to the invention. The article of 
manufacture containing the computer programming code 
may be made and/or used by executing the code directly 
from one medium, by copying the code from one medium to 
another medium, or by transmitting the code over a network. 

An apparatus for making, using, or selling the invention 
may be one or more processing systems including, but not 
limited to, a central processing unit (CPU), memory, storage 
devices, communication links, communication devices, 
servers, I/O devices, or any subcomponents or individual 
parts of one or more processing systems, including software, 
firmware, hardware or any combination or subcombination 
thereof, which embody the invention as set forth in the 
claims. 

User input may be received from the keyboard, mouse, 
pen, voice, touch screen, or any other means by which a 
human can input data to a computer, including through other 
programs such as application programs. 

One skilled in the art of computer science will easily be 
able to combine the software created as described with 
appropriate general purpose or special purpose computer 
hardware to create a computer system and/or computer 
subcomponents embodying the invention and to create a 
computer system and/or computer subcomponents for car- 
rying out the method of the invention. While the preferred 
embodiment of the present invention has been illustrated in 
detail, it should be apparent that modifications and adapta- 
tions to that embodiment may occur to one skilled in the art 
without departing from the spirit or scope of the present 
invention as set forth in the following claims. 

While the preferred embodiments of the present invention 
have been illustrated in detail, it should be apparent that 
modifications and adaptations to those embodiments may 
occur to one skilled in the art without departing from the 
scope of the present invention as set forth in the following 
claims. 

What is claimed is:
1. A computer program product, for use with a computer
system, for directing the computer system to execute a
search of information resources, the resources having
content-based links between each other, to identify a desired
subset of the information resources which satisfy a desired
criterion, the computer program product comprising:
a computer-readable medium;

means, provided on the recording medium, for directing
the computer system to identify an initial set of
information resources;

means, provided on the recording medium, for directing
the computer system to define initial authoritativeness
information for the initial set;

means, provided on the recording medium, for directing
the computer system to use the initial authoritativeness
information as input authoritativeness information, to
execute the steps of:
(i) producing first authoritativeness information about a
set of information resources pointed to by links in
resources of the input set, and

(ii) producing second authoritativeness information
about a set of information resources having links that
point to resources of the input set; and

means, provided on the recording medium, for directing
the computer system to produce a final set of information
resources based on the first and second authoritativeness
information.

2. A computer program product as recited in claim 1,
wherein the information resources include World Wide Web
pages, and the content-based links include hyperlinks.

3. A computer program product as recited in claim 1,
wherein the means for directing to identify an initial set of
information resources includes means, provided on the
recording medium, for directing the computer system to
obtain, as an input, an information resource containing
subject matter of interest.

4. A computer program product as recited in claim 3,
wherein the means for directing to identify an initial set of
information resources includes means, provided on the
recording medium, for directing the computer system to
identify a further set of information resources linked to the
input information resource.

5. A computer program product as recited in claim 1,
wherein:

the means for directing to execute the steps of producing
first and second authoritativeness information is
operative in a series of iterations;
the initial authoritativeness information is used as input
authoritativeness information for a first iteration; and
the produced first and second authoritativeness
information is a result of the iteration, the first and second
authoritativeness information produced in a given
iteration to be used as the input authoritativeness
information for the next iteration.

6. A computer program product as recited in claim 1
further comprising means, provided on the recording
medium, for directing the computer system to execute the
steps of producing first authoritativeness information and
producing second authoritativeness information in a series
of iterations until a predetermined condition is met.

7. A computer program product as recited in claim 6,
wherein the predetermined condition includes the execution
of a specified number of iterations.

8. A computer program product as recited in claim 6,
wherein the predetermined condition includes a steady state
in which further iterations result in substantially the same
results.

9. A computer program product as recited in claim 6
wherein the means for directing to identify an initial set of
information resources includes means, provided on the
recording medium, for directing the computer system to
execute a keyword-based query search, results of the search
including information resources to be included in the initial
set.

10. A computer program product as recited in claim 9,
wherein the means for directing to identify an initial set of
information resources further includes means, provided on
the recording medium, for directing the computer system to
identify information resources linked to or from the
information resources which are the results of the search, the
former information resources also to be included in the
initial set.

11. A computer program product as recited in claim 10,
wherein the means for directing to define initial
authoritativeness information includes means, provided on the
recording medium, for directing the computer system to
select an initial numerical authoritativeness value for each of
the information resources of the initial set.

12. A computer program product as recited in claim 11,
wherein the means for directing to define initial
authoritativeness information further includes means, provided on the
recording medium, for directing the computer system to
define an authority value and a hub value for each of the
information resources of the initial set.

13. A computer program product as recited in claim 12,
wherein the defined authority values and hub values are
processed as vectors, each vector containing a respective
term corresponding with each respective one of the
information resources of the initial set, and having stored therein
the value defined for that respective one of the information
resources of the initial set.

14. A computer program product as recited in claim 12,
wherein:

an initial hub value is defined as 1 if the information
resource was found by the keyword-based query
search, and 0 if the information resource is linked to or
from the information resources which are the results of
the search; and

an initial authority value is defined as 0 for all information
resources.

15. A computer program product as recited in claim 12,
wherein, for each iteration:

the hub value for an information resource is updated as the
sum of the authority values for authority information
resources which point to the hub information resource;
and

the authority value for an information resource is updated
as the sum of the hub values for hub information
resources which are pointed to by the information
resource.

16. A computer program product as recited in claim 15,
wherein each iteration further includes normalizing the hub
and authority values for the information resources.

17. A computer program product as recited in claim 1,
wherein the means for directing to produce a final set of
information resources includes means, provided on the
recording medium, for directing the computer system to
select information resources from the set based on their hub
and authority values.

18. A computer program product as recited in claim 17,
wherein the means for directing to select includes means,
provided on the recording medium, for directing the
computer system to select information resources whose hub
values or authority values have greatest magnitudes.

19. A computer program product as recited in claim 17,
wherein the means for directing to select includes means,
provided on the recording medium, for directing the
computer system to select a plurality of successive communities,
selecting each successive community including selecting
information resources whose hub values or authority values
have greatest magnitudes of those information resources not
already selected for a prior community.

20. A method for executing a search of information
resources, the resources having content-based links between
each other, to identify a desired subset of the information
resources which satisfy a desired criterion, the method
comprising the steps of:

identifying an initial set of information resources;
defining initial authoritativeness information for the
initial set;
using the initial authoritativeness information as input
authoritativeness information, executing the steps of:

(i) producing first authoritativeness information about a
set of information resources pointed to by links in
resources of the input set, and

(ii) producing second authoritativeness information
about a set of information resources having links that
point to resources of the input set; and

producing a final set of information resources based on
the first and second authoritativeness information.

21. A method as recited in claim 20, wherein the infor- 
mation resources include World Wide Web pages, and the 
content-based links include hyperlinks. 

22. A method as recited in claim 20, wherein the step of
identifying an initial set of information resources includes
obtaining, as an input, an information resource containing
subject matter of interest.

23. A method as recited in claim 22, wherein the step of 
identifying an initial set of information resources includes 
identifying a further set of information resources linked to 
the input information resource. 

24. A method as recited in claim 20, wherein:
the step of executing the steps of producing first and
second authoritativeness information is executed in a
series of iterations;
the initial authoritativeness information is used as input
authoritativeness information for a first iteration; and
the produced first and second authoritativeness
information is a result of the iteration, the first and second
authoritativeness information produced in a given
iteration to be used as the input authoritativeness
information for the next iteration.

25. A method as recited in claim 20, wherein the steps of
producing first authoritativeness information and producing
second authoritativeness information are executed in a series
of iterations until a predetermined condition is met.

26. A method as recited in claim 25, wherein the
predetermined condition includes the execution of a specified
number of iterations.

27. A method as recited in claim 25, wherein the
predetermined condition includes a steady state in which further
iterations result in substantially the same results.

28. A method as recited in claim 25, wherein the step of
identifying an initial set of information resources includes
executing a keyword-based query search, results of the
search including information resources to be included in the
initial set.

29. A method as recited in claim 28, wherein the step of
identifying an initial set of information resources further
includes identifying information resources linked to or from
the information resources which are the results of the search,
the former information resources also to be included in the
initial set.

30. A method as recited in claim 29, wherein the step of
defining initial authoritativeness information includes
selecting an initial numerical authoritativeness value for
each of the information resources of the initial set.

31. A method as recited in claim 30, wherein the step of
defining initial authoritativeness information further
includes defining an authority value and a hub value for each
of the information resources of the initial set.

32. A method as recited in claim 31, wherein the defined
authority values and hub values are processed as vectors,
each vector containing a respective term corresponding with
each respective one of the information resources of the
initial set, and having stored therein the value defined for
that respective one of the information resources of the initial
set.

33. A method as recited in claim 31, wherein:

an initial hub value is defined as 1 if the information
resource was found by the keyword-based query
search, and 0 if the information resource is linked to or
from the information resources which are the results of
the search; and
an initial authority value is defined as 0 for all information
resources.

34. A method as recited in claim 31, wherein, for each
iteration:

the hub value for an information resource is updated as the
sum of the authority values for authority information
resources which point to the hub information resource;
and

the authority value for an information resource is updated
as the sum of the hub values for hub information
resources which are pointed to by the information
resource.

35. A method as recited in claim wherein each
iteration further includes normalizing the hub and authority
values for the information resources.

36. A method as recited in claim 20, wherein:
each information resource is associated with an authority
value and a hub value; and
the step of producing a final set of information resources
includes selecting information resources from the set
based on the hub and authority values.

37. A method as recited in claim 36, wherein the step of
selecting includes selecting information resources whose
hub values or authority values have greatest magnitudes.

38. A method as recited in claim 36, wherein the step of
selecting includes selecting a plurality of successive
communities, selecting each successive community
including selecting information resources whose hub values or
authority values have greatest magnitudes of those
information resources not already selected for a prior community.

39. A system for executing a search of information
resources, the resources having content-based links between
each other, to identify a desired subset of the information
resources which satisfy a desired criterion, the system
comprising:

means for identifying an initial set of information
resources;

means for defining initial authoritativeness information
for the initial set;
means for using the initial authoritativeness information
as input authoritativeness information, to execute the
steps of:

(i) producing first authoritativeness information about a
set of information resources pointed to by links in
resources of the input set, and

(ii) producing second authoritativeness information
about a set of information resources having links that
point to resources of the input set; and

means for producing a final set of information resources
based on the first and second authoritativeness
information.

40. A system as recced in claim 39, wherein the infor- 
mation resources include World Wide Web pages, and the 
content-based links include hyperlinks. 

41. A system as recited in claim 39, wherein the means for
identifying an initial set of information resources includes
means for obtaining, as an input, an information resource
containing subject matter of interest.

42. A system as recited in claim 41, wherein the means for
identifying an initial set of information resources includes
means for identifying a further set of information resources
linked to the input information resource.

43. A system as recited in claim 39, wherein:
the means for executing the steps of producing first and
second authoritativeness information is operative in a
series of iterations;
the initial authoritativeness information is used as input
authoritativeness information for a first iteration; and
the produced first and second authoritativeness
information is a result of the iteration, the first and second
authoritativeness information produced in a given
iteration to be used as the input authoritativeness
information for the next iteration.

44. A system as recited in claim 39 further comprising
means for executing the steps of producing first
authoritativeness information and producing second authoritativeness
information in a series of iterations until a predetermined
condition is met.

45. A system as recited in claim 44, wherein the
predetermined condition includes the execution of a specified
number of iterations.

46. A system as recited in claim 44, wherein the prede- 
termined condition includes a steady state in which further 
iterations result in substantially the same results. 

47. A system as recited in claim 44, wherein the means for
identifying an initial set of information resources includes
means for executing a keyword-based query search, results
of the search including information resources to be included
in the initial set.

48. A system as recited in claim 47, wherein the means for
identifying an initial set of information resources further
includes means for identifying information resources linked
to or from the information resources which are the results of
the search, the former information resources also to be
included in the initial set.

49. A system as recited in claim wherein the means for
defining initial authoritativeness information includes means
for selecting an initial numerical authoritativeness value for
each of the information resources of the initial set.

50. A system as recited in claim 49, wherein the means for
defining initial authoritativeness information further
includes means for defining an authority value and a hub
value for each of the information resources of the initial set.

51. A system as recited in claim 50, wherein the defined
authority values and hub values are processed as vectors,
each vector containing a respective term corresponding with
each respective one of the information resources of the
initial set, and having stored therein the value defined for
that respective one of the information resources of the initial
set.

52. A system as recited in claim 50, wherein:
an initial hub value is defined as 1 if the information
resource was found by the keyword-based query
search, and 0 if the information resource is linked to or
from the information resources which are the results of
the search; and
an initial authority value is defined as 0 for all information
resources.

53. A system as recited in claim 50, wherein, for each
iteration:

the hub value for an information resource is updated as the
sum of the authority values for authority information
resources which point to the hub information resource;
and

the authority value for an information resource is updated
as the sum of the hub values for hub information
resources which are pointed to by the information
resource.

54. A system as recited in claim 53, wherein each iteration
further includes normalizing the hub and authority values for
the information resources.

55. A system as recited in claim 39, wherein the means for
producing a final set of information resources includes
means for selecting information resources from the set based
on their hub and authority values.

56. A system as recited in claim 55, wherein the means for
selecting includes means for selecting information resources
whose hub values or authority values have greatest
magnitudes.

57. A system as recited in claim 55, wherein the means for
selecting includes means for selecting a plurality of
successive communities, selecting each successive community
including selecting information resources whose hub values
or authority values have greatest magnitudes of those
information resources not already selected for a prior community.